Skip to content
This repository has been archived by the owner on Nov 9, 2019. It is now read-only.

Implement fd_filestat_get for all platforms #42

Merged
merged 7 commits into from
Jul 26, 2019

Conversation

marmistrz
Copy link
Member

@marmistrz marmistrz commented Jul 24, 2019

Current issues/doubts about this PR:

  • On Windows, we make multiple syscalls for every call of fd_filestat_get. Anyway, so as to implement this function efficiently, we need to move a large part of the functionality to libstd, so I prefer code cleanliness over performance for now in this change
  • I'm not sure what to do with integer overflows/failed conversions between various types (currently I'm using try_into and panic if the conversion fails)
  • file_allocate fails on Windows for no clear reason: http://paste.debian.net/1092902/ (this paste will expire in 90d) and it seems not to be connected with my change - it looks like the tests crashes during https:/CraneStation/wasi-misc-tests/blob/9f8c648fb14afc1ed9d2effa2567870d65cbc902/src/bin/file_allocate.rs#L91

@kubkon
Copy link
Member

kubkon commented Jul 24, 2019

Thanks! I've not had a chance to look at it in-detail just yet, but a couple of high-level comments:

  • we've moved away from panics entirely, and would prefer to return an appropriate host::__wasi_errno_t instead; for instance, for overflows I'd suggest to throw a host::__WASI_EOVERFLOW error while for failed conversions host::__WASI_EILSEQ or host::__WASI_EINVAL
  • as far as the failed testcases are concerned, you can analyse the trace of syscalls in-detail with the following command:
RUST_LOG=wasi_common=trace cargo test file_allocate

In particular, for file_allocate, it is expected to fail on Windows until #41 and in particular path_unlink_file is implemented :-)

build.rs Outdated Show resolved Hide resolved
use std::time::{SystemTime, UNIX_EPOCH};
fn timestamp(st: SystemTime) -> io::Result<u64> {
st.duration_since(UNIX_EPOCH)
.map_err(|e| io::Error::new(io::ErrorKind::Other, e.to_string()))?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of mapping into catchall io::ErrorKind::Other here, wouldn't it be better to map directly to host::__WASI_EINVAL for example?

Copy link
Member Author

@marmistrz marmistrz Jul 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm returning io::Result in fd_filestat_get_impl, so I can't. I could probably introduce a new enum to contain either the WASI error or std::io::Error. Or add a convenience helper function to convert std::io::Error to the WASI error so that I can just map_err

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But could you not return our Result type immediately without introducing a new enum nor a convenience helper? Then, I feel like we would retain finer grain control over the error values. For example, with the current approach, regardless of where the error happens along the way, we'll return a io::ErrorKind::Other which will get mapped to host::__WASI_EIO, whereas if had our Result type returned, then st.duration_since(..) could return an host::__WASI_EINVAL while later on, on try_into() you'd map to host::__WASI_EILSEQ. Also, I see that only file.metadata() returns an io::Result, and your helper fns could be rewritten to return our Result type for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that in a future PR we should refactor all of the map_err by introducing another enum type or something, but one step at a time :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kubkon most of the libstd function return io::Result and I'd like to (1) avoid repeating e.raw_os_error().map_or(host::__WASI_EIO, errno_from_host) over and over (2) be able to debug! print the io::Error in case it's ErrorKind::Other (sometimes returned by libstd).

io::Error is returned by: metadata, accessed, modified. metadata is also extensively used by all the helper functions on Unix, when using Windows syscalls I also return an io::Error from last OS error.

On the other hand, all the wasi errors appear in the platform-independent fd_filestat_get_impl. Hence, IMO the platform dependent code should return io::Result and the platform-independent code should convert them to wasi error types

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I share @kubkon's concern about retaining fine-grained error values. Rust's ErrorKind doesn't have values for all the error codes that we use, such as EOVERFLOW and EILSEQ.

In the case of duration_since, if I read the documentation correctly, it'll fail here if given a time before the UNIX epoch. My interpretation is that EOVERFLOW is the most appropriate error code for this; eg. POSIX says "[EOVERFLOW]
A value to be stored would overflow one of the members of the stat structure." So we'd like to be able to return that rather than EIO.

.as_nanos()
.try_into()
.map_err(|e: std::num::TryFromIntError| {
io::Error::new(io::ErrorKind::Other, e.to_string())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, as above :-)

st_ino: hostcalls_impl::file_serial_no(file)?,
st_nlink: hostcalls_impl::num_hardlinks(file)?
.try_into()
.expect("overflow"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we should probably return host::__WASI_EOVERFLOW.

winx::file::change_time(file)
}

pub(crate) fn filetype(file: &File) -> io::Result<host::__wasi_filetype_t> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, should we perhaps augment this function with winx::file::get_file_type for now? This function allows you to match for __WASI_FILETYPE_CHARACTER_DEVICE and __WASI_FILETYPE_SOCKET_STREAM. Also, now that I look at this function here, it would be good if we could re-use in sys::windows::fdentry_impl::determine_type_rights.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then again, it might not make that much sense if we are specifically expecting a File especially on Windows 🤔

Ok(file.metadata()?.ino())
}

pub(crate) fn filetype(file: &File) -> io::Result<host::__wasi_filetype_t> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we re-use this function in sys::unix::fdentry_impl::determine_type_rights?

@kubkon
Copy link
Member

kubkon commented Jul 24, 2019

cc @sunfishcode @alexcrichton

pub(crate) fn fd_filestat_get(fd: &File) -> Result<host::__wasi_filestat_t> {
unimplemented!("fd_filestat_get")
pub(crate) fn num_hardlinks(file: &File) -> io::Result<u64> {
Ok(winx::file::get_fileinfo(file)?.nNumberOfLinks.into())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort of like the Unix implementation, could this operate on fs::Metadata or some similar structure which is only created once instead of being created multiple times?

@marmistrz
Copy link
Member Author

marmistrz commented Jul 25, 2019

I modified the code to return proper error codes. But now I'm wondering if it makes any sense to have a separate fd_filestat_get_impl - I made it a separate function because it was meant to return an io::Error

@alexcrichton
Copy link
Collaborator

Looking more at this I think it may be best actually to have the original organization where there's a totally platform-specific implementation for both Windows and non-Windows. On non-Windows it can be reimplemented to use file.metadata(), but on Windows I think it probably wants to execute GetFileInformationByHandle only once and then use information from there.

Looking at Rust's libstd there's two ways to acquire a fs::Metadata, one from DirEntry and the other from fs::metadata. The two sources have different bits of data on them and fs::Metadata is the intersection of the data. We probably want to add an extension trait for accessing fs::Metadata extra information when it's learned through fs::Metadata to avoid having to reproduce all of the information here.

@marmistrz
Copy link
Member Author

marmistrz commented Jul 25, 2019

You're right.
fs::metadata has information taken from GetFileInformationByHandle and DirEntry from FindNextFileW.

I'll do as you suggest.

Looking at Rust's libstd there's two ways to acquire a fs::Metadata, one from DirEntry and the other from fs::metadata. The two sources have different bits of data on them and fs::Metadata is the intersection of the data. We probably want to add an extension trait for accessing fs::Metadata extra information when it's learned through fs::Metadata to avoid having to reproduce all of the information here.

We could extend std::os::windows::fs::MetadataExt and return an Option<_>, and switch to the extension of MetadataExt when it's stabilized

This will still need two syscalls on Windows: GetFileInformationByHandle once & GetFileInformationByHandleEx once but I doubt we can avoid it. I'll recheck tomorrow if we can (UTC+2)

@alexcrichton
Copy link
Collaborator

@marmistrz
Copy link
Member Author

@alexcrichton I thought I was going to have my first libstd pull request :P
LGTM :)

@alexcrichton
Copy link
Collaborator

Oh oops my bad! I certainly don't want to steal any thunder by any means

@kubkon
Copy link
Member

kubkon commented Jul 25, 2019

Hahaha, @alexcrichton @marmistrz love the banter! You guys are the best :-D

@kubkon
Copy link
Member

kubkon commented Jul 26, 2019

LGTM! @sunfishcode @marmistrz do you guys have any other comments you'd like addressed?

@marmistrz
Copy link
Member Author

Could we extend

pub type __wasi_linkcount_t = u32;

to u64 so that there is no overflow possible in nlink()?

@kubkon
Copy link
Member

kubkon commented Jul 26, 2019

@marmistrz AFAIK that might not be that simple as it’d require changing the WASI spec. I’m sure that @sunfishcode will be able to shed some light on this.

@marmistrz
Copy link
Member Author

marmistrz commented Jul 26, 2019

I rebased and refactored the change as @alexcrichton suggested. I left over a duplicate convert_err helper function for now.
I can already see I'm going to use such a helper function a lot (I need it for my current draft of fd_filestat_set_size, it would also be useful if we implemented fd_filestat_set_times using the platform-agnostic filetime crate), so I have the following ideas, in increasing order of complexity.

  1. Add errno_from_io_error which is std::io::Error -> host::__wasi_errno_t to crate::sys which would correspond to the current convert_err. In particular, I find it important that we print debug! information if some information about the error is lost.
  2. Create a new type
enum Error {
     RawError(host::__wasi_errno_t),
     OsError(std::io::Error),
}

add a conversion method and handle the conversion as a part of the hostcalls macro. Also, From<std::io::Error>
3. As in two, but swap the inner host::__wasi_errno_t for a new type, say, WASIError which would implement TryFrom<host::__wasi_errno_t>. Then it would be possible to have impl From<WASIError> for Error

2 or 3 would have to be done as a subsequent pull request. What do you think?

@marmistrz
Copy link
Member Author

The CI failure is unrelated to the changes in this PR, master is currently failing.

@kubkon
Copy link
Member

kubkon commented Jul 26, 2019

The CI failure is unrelated to the changes in this PR, master is currently failing.

I'm on it, it should be fixed in #45.

@kubkon
Copy link
Member

kubkon commented Jul 26, 2019

I rebased and refactored the change as @alexcrichton suggested. I left over a duplicate convert_err helper function for now.
I can already see I'm going to use such a helper function a lot (I need it for my current draft of fd_filestat_set_size, it would also be useful if we implemented fd_filestat_set_times using the platform-agnostic filetime crate), so I have the following ideas, in increasing order of complexity.

1. Add `errno_from_io_error` which is `std::io::Error -> host::__wasi_errno_t` to `crate::sys` which would correspond to the current `convert_err`. In particular, I find it important that we print `debug!` information if some information about the error is lost.

2. Create a new type
enum Error {
     RawError(host::__wasi_errno_t),
     OsError(std::io::Error),
}

add a conversion method and handle the conversion as a part of the hostcalls macro. Also, From<std::io::Error>
3. As in two, but swap the inner host::__wasi_errno_t for a new type, say, WASIError which would implement TryFrom<host::__wasi_errno_t>. Then it would be possible to have impl From<WASIError> for Error

2 or 3 would have to be done as a subsequent pull request. What do you think?

Awesome, thanks! I like the way this is going, and temporarily I'd focus on option 1. for now until we land a working Windows implementation, which for me (can't speak for anyone else here) is a must-have and a priority. Afterwards, I'm happy to refactor in the most efficient/elegant way possible :-)

@marmistrz
Copy link
Member Author

marmistrz commented Jul 26, 2019

I refactored the error handling, this should be ready. In a subsequent PR I'll reuse the errno_from_ioerror as much as possible so as to remove the code duplication.

@kubkon
Copy link
Member

kubkon commented Jul 26, 2019

I've now merged a fix for the CI in #45 so if you could rebase and check that everything tests out fine for you :-)

@marmistrz
Copy link
Member Author

marmistrz commented Jul 26, 2019

Rebased, thanks

@sunfishcode
Copy link
Member

@marmistrz Yes, extending __wasi_linkcount_t to 64-bit makes sense to me, though we will indeed need a spec change, and it'll be an ABI change, both of which we can do, it'll just take some coordination. I filed WebAssembly/WASI#70 to track this.

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Jul 26, 2019
This commit adds accessors for more fields in `fs::Metadata` on Windows
which weren't previously exposed. There's two sources of `fs::Metadata`
on Windows currently, one from `DirEntry` and one from a file itself.
These two sources of information don't actually have the same set of
fields exposed in their stat information, however. To handle this the
platform-specific accessors of Windows-specific information all return
`Option` to return `None` in the case a metadata comes from a
`DirEntry`, but they're guaranteed to return `Some` if it comes from a
file itself.

This is motivated by some changes in CraneStation/wasi-common#42, and
I'm curious how others feel about this platform-specific functionality!
Centril added a commit to Centril/rust that referenced this pull request Jul 26, 2019
…ackler

std: Add more accessors for `Metadata` on Windows

This commit adds accessors for more fields in `fs::Metadata` on Windows
which weren't previously exposed. There's two sources of `fs::Metadata`
on Windows currently, one from `DirEntry` and one from a file itself.
These two sources of information don't actually have the same set of
fields exposed in their stat information, however. To handle this the
platform-specific accessors of Windows-specific information all return
`Option` to return `None` in the case a metadata comes from a
`DirEntry`, but they're guaranteed to return `Some` if it comes from a
file itself.

This is motivated by some changes in CraneStation/wasi-common#42, and
I'm curious how others feel about this platform-specific functionality!
@kubkon kubkon merged commit 89fbde2 into CraneStation:master Jul 26, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants